1,415 research outputs found

    Evaluating Two-Stream CNN for Video Classification

    Videos contain very rich semantic information. Traditional hand-crafted features are known to be inadequate in analyzing complex video semantics. Inspired by the huge success of the deep learning methods in analyzing image, audio and text data, significant efforts are recently being devoted to the design of deep nets for video analytics. Among the many practical needs, classifying videos (or video clips) based on their major semantic categories (e.g., "skiing") is useful in many applications. In this paper, we conduct an in-depth study to investigate important implementation options that may affect the performance of deep nets on video classification. Our evaluations are conducted on top of a recent two-stream convolutional neural network (CNN) pipeline, which uses both static frames and motion optical flows, and has demonstrated competitive performance against the state-of-the-art methods. In order to gain insights and to arrive at a practical guideline, many important options are studied, including network architectures, model fusion, learning parameters and the final prediction methods. Based on the evaluations, very competitive results are attained on two popular video classification benchmarks. We hope that the discussions and conclusions from this work can help researchers in related fields to quickly set up a good basis for further investigations along this very promising direction. Comment: ACM ICMR'1
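The abstract above lists model fusion among the options studied for the two-stream pipeline. As a rough illustration only, the sketch below shows one common form of late fusion: a weighted average of the class probabilities from the frame (spatial) stream and the optical-flow (temporal) stream. The function names, the `temporal_weight` parameter, and the example logits are hypothetical; the abstract does not say which fusion rule or weights the authors used.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Weighted average of per-class probabilities from the two streams.

    temporal_weight is a hypothetical knob: 0.0 uses only the frame stream,
    1.0 uses only the optical-flow stream.
    """
    spatial = softmax(spatial_logits)
    temporal = softmax(temporal_logits)
    w = temporal_weight
    return [(1.0 - w) * s + w * t for s, t in zip(spatial, temporal)]


def predict(probabilities):
    """Index of the most probable class."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Made-up scores for three classes from each stream.
spatial_scores = [2.0, 1.0, 0.1]
temporal_scores = [0.5, 3.0, 0.2]
fused = late_fusion(spatial_scores, temporal_scores, temporal_weight=0.5)
fused_class = predict(fused)
```

Sweeping `temporal_weight` on a validation set is one simple way to compare fusion settings under this assumption.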

    Improving Small Object Proposals for Company Logo Detection

    Many modern approaches for object detection are two-stage pipelines. The first stage identifies regions of interest, which are then classified in the second stage. Faster R-CNN is such an approach for object detection, which combines both stages into a single pipeline. In this paper we apply Faster R-CNN to the task of company logo detection. Motivated by its weak performance on small object instances, we examine in detail both the proposal and the classification stage with respect to a wide range of object sizes. We investigate the influence of feature map resolution on the performance of those stages. Based on theoretical considerations, we introduce an improved scheme for generating anchor proposals and propose a modification to Faster R-CNN which leverages higher-resolution feature maps for small objects. We evaluate our approach on the FlickrLogos dataset, improving the RPN performance from 0.52 to 0.71 (MABO) and the detection performance from 0.52 to 0.67 (mAP). Comment: 8 Pages, ICMR 201
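The proposal-stage figures above are reported as MABO (mean average best overlap). As a minimal sketch of how that metric is usually defined, the code below takes, for each ground-truth box, the best intersection-over-union (IoU) with any proposal, averages those values per class, then averages over classes. The single-image setup, the `(x1, y1, x2, y2)` box format, and all names and example boxes are assumptions made for illustration, not the paper's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mabo(ground_truth_by_class, proposals):
    """Mean over classes of the average best overlap (ABO) per class.

    Simplification: all boxes are assumed to come from one image.
    """
    per_class_abo = []
    for boxes in ground_truth_by_class.values():
        best = [max((iou(gt, p) for p in proposals), default=0.0) for gt in boxes]
        per_class_abo.append(sum(best) / len(best))
    return sum(per_class_abo) / len(per_class_abo)


# Hypothetical ground truth for two logo classes and two proposals.
ground_truth = {
    "logo_a": [(0, 0, 10, 10)],
    "logo_b": [(20, 20, 30, 30)],
}
proposals = [(0, 0, 10, 10), (20, 20, 30, 25)]
score = mabo(ground_truth, proposals)
```

In this toy case the first ground-truth box is matched exactly and the second is half covered, so the two per-class values are averaged.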

    Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images

    We address the problem of fine-grained action localization from temporally untrimmed web videos. We assume that only weak video-level annotations are available for training. The goal is to use these weak labels to identify temporal segments corresponding to the actions, and learn models that generalize to unconstrained web videos. We find that web images queried by action names serve as well-localized highlights for many actions, but are noisily labeled. To solve this problem, we propose a simple yet effective method that takes weak video labels and noisy image labels as input, and generates localized action frames as output. This is achieved by cross-domain transfer between video frames and web images, using pre-trained deep convolutional neural networks. We then use the localized action frames to train action recognition models with long short-term memory networks. We collect a fine-grained sports action data set FGA-240 of more than 130,000 YouTube videos. It has 240 fine-grained actions under 85 sports activities. Convincing results are shown on the FGA-240 data set, as well as the THUMOS 2014 localization data set with untrimmed training videos. Comment: Camera ready version for ACM Multimedia 201

    Modeling the Grain Cleaning Process of a Stationary Sorghum Thresher

    Rosana G. Moreira, Editor-in-Chief; Texas A&M University. This is a paper from International Commission of Agricultural Engineering (CIGR, Commission Internationale du Genie Rural) E-Journal Volume 8 (2006): Modeling the Grain Cleaning Process of a Stationary Sorghum Thresher. Manuscript PM 06 012. Vol. VIII. August, 2006

    Efficient On-the-fly Category Retrieval using ConvNets and GPUs

    We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval - where classifiers are learnt at run time for a textual query from downloaded images, and used to rank large image or video datasets. We make three contributions: (i) we present an evaluation of state-of-the-art image representations for object category retrieval over standard benchmark datasets containing 1M+ images; (ii) we show that ConvNets can be used to obtain features which are incredibly performant, and yet much lower dimensional than previous state-of-the-art image representations, and that their dimensionality can be reduced further without loss in performance by compression using product quantization or binarization. Consequently, features with state-of-the-art performance on large-scale datasets of millions of images can fit in the memory of even a commodity GPU card; (iii) we show that an SVM classifier can be learnt within a ConvNet framework on a GPU in parallel with downloading the new training images, allowing for a continuous refinement of the model as more images become available, and simultaneous training and ranking. The outcome is an on-the-fly system that significantly outperforms its predecessors in terms of precision of retrieval, memory requirements, and speed, facilitating accurate on-the-fly learning and ranking in under a second on a single GPU. Comment: Published in proceedings of ACCV 201
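The abstract above mentions compressing ConvNet features by binarization so that millions of them fit in GPU memory. The sketch below illustrates the general idea with the simplest scheme, thresholding each dimension at zero and ranking by Hamming distance. The sign threshold, the function names, and the toy vectors are assumptions for illustration; the abstract does not state which binarization scheme the authors used.

```python
def binarize(features):
    """Pack one sign bit per dimension into an int (bit i set when features[i] > 0)."""
    code = 0
    for i, value in enumerate(features):
        if value > 0:
            code |= 1 << i
    return code


def hamming(code_a, code_b):
    """Number of differing bits between two binary codes."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query, database):
    """Database indices ordered from nearest to farthest binary code."""
    query_code = binarize(query)
    codes = [binarize(vector) for vector in database]
    return sorted(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))


# Toy 4-dimensional "features"; real ConvNet features have far more dimensions.
query = [0.5, -1.0, 2.0, -0.1]
database = [
    [-1.0, 1.0, -1.0, 1.0],
    [0.4, -2.0, 1.0, -3.0],
    [0.2, 0.3, 1.0, -1.0],
]
ranking = rank_by_hamming(query, database)
```

Each dimension costs one bit instead of a 32-bit float, which is where the memory saving in this kind of scheme comes from.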

    Context-Aware Embeddings for Automatic Art Analysis

    Automatic art analysis aims to classify and retrieve artistic representations from a collection of images by using computer vision and machine learning techniques. In this work, we propose to enhance visual representations from neural networks with contextual artistic information. Whereas visual representations are able to capture information about the content and the style of an artwork, our proposed context-aware embeddings additionally encode relationships between different artistic attributes, such as author, school, or historical period. We design two different approaches for using context in automatic art analysis. In the first one, contextual data is obtained through a multi-task learning model, in which several attributes are trained together to find visual relationships between elements. In the second approach, context is obtained through an art-specific knowledge graph, which encodes relationships between artistic attributes. An exhaustive evaluation of both of our models in several art analysis problems, such as author identification, type classification, or cross-modal retrieval, shows that performance is improved by up to 7.3% in art classification and 37.24% in retrieval when context-aware embeddings are used.

    Investigating Grain Separation and Cleaning Efficiency Distribution of a Conventional Stationary Rasp-bar Sorghum Thresher

    A stationary grain thresher was developed and used to study grain separation and cleaning efficiency distribution of the cleaning unit, fractionated by sieve and horizontal air stream, along the sieve length. The influence of feed rate (m), air speed (VA) and sieve oscillation frequency (FS) on the cleaning efficiency of sorghum was explored. Grain separation along the sieve can be divided into three sections: increasing, peak and decreasing. Results showed that cleaning efficiency decreased with increasing sieve oscillation frequency and with increasing feed rate. Cleaning loss increased with increasing sieve oscillation frequency, feed rate and air speed.

    Mathematical modeling of blanched and unblanched solar dried ginger rhizome varieties.

    This research examines the mathematical modelling of blanched and unblanched solar dried ginger rhizome varieties. The Umudike ginger I and II (UG I and UG II) were blanched with an electric water bath in the Soil and Water Laboratory, Agricultural and Bioresources Engineering Department, Michael Okpara University of Agriculture Umudike, Abia State. The samples UG I and UG II were blanched for 3, 6, and 9 minutes at 50℃. Each treated sample was subjected to active solar drying in sequence. Also, blanched and unblanched UG I and UG II were subjected to active solar drying. The treatment was carried out at 10 mm thickness for UG I and UG II rhizome. Ten mathematical drying models were compared based on the correlation coefficient, mean bias error, root mean square error and reduced chi-square. The models used are efficient thin layer drying models, and the best fitted model varies between the blanched and unblanched treatments of UG I and UG II. They were also used to validate and predict equations for all the treatments. The Henderson and Pabis model was recommended for predicting the drying characteristics of blanched and unblanched UG I and UG II ginger rhizomes.
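For readers unfamiliar with the recommended model, the Henderson and Pabis thin-layer equation expresses moisture ratio as MR(t) = a·exp(−k·t). The sketch below fits a and k by linear regression on ln(MR) and computes three of the criteria named in the abstract (mean bias error, root mean square error, reduced chi-square). The log-linear fitting route, the function names, and the synthetic data are assumptions for illustration; the abstract does not describe the authors' fitting procedure, and no measured values from the study are used.

```python
import math


def henderson_pabis(t, a, k):
    """Moisture ratio predicted by the Henderson and Pabis model."""
    return a * math.exp(-k * t)


def fit_henderson_pabis(times, moisture_ratios):
    """Fit ln(MR) = ln(a) - k*t by ordinary least squares; returns (a, k)."""
    n = len(times)
    logs = [math.log(mr) for mr in moisture_ratios]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


def fit_statistics(observed, predicted, n_params=2):
    """Mean bias error, RMSE and reduced chi-square of a fitted model."""
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    squared = sum(r * r for r in residuals)
    return {
        "mbe": sum(residuals) / n,
        "rmse": math.sqrt(squared / n),
        "reduced_chi_square": squared / (n - n_params),
    }


# Synthetic, noise-free drying curve generated from hypothetical a and k.
times = [float(hour) for hour in range(9)]
observed = [henderson_pabis(t, 1.05, 0.30) for t in times]
a_fit, k_fit = fit_henderson_pabis(times, observed)
predicted = [henderson_pabis(t, a_fit, k_fit) for t in times]
stats = fit_statistics(observed, predicted)
```

With real, noisy measurements a nonlinear least-squares fit on MR itself may be preferable, since the log transform reweights the late, low-moisture points.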

    Assessment of Workers’ Level of Exposure to Work-Related Musculoskeletal Discomfort in Dewatered Cassava Mash Sieving Process

    This study was undertaken to assess the level of exposure of processors to work-related musculoskeletal disorder when using the locally developed traditional sieve in the sieving process. The Quick ergonomic checklist (QEC), involving the researcher's and the processors' assessment using the risk assessment checklist, was used in this assessment, and data were obtained from a sample of one hundred and eight (108) processors randomly selected from three senatorial districts of Rivers State. Thirty-six processors from each zone, comprising 14 males and 22 females, were selected and assessed on the basis of their back, shoulder/arm, wrist/hand and neck posture and frequency of movement during the traditional sieving process. The result of the assessment showed that the highest risk of discomfort occurred at the region of the wrist/hand, followed by back, shoulder/arm, and neck. The posture used in the sieving process not only exposed the processors to the discomfort of pain but also put them at high risk of musculoskeletal disorder, as indicated by a high percentage exposure of 66% on the QEC rating. The result indicated a need for immediate attention and a change to an improved method that will reduce the discomfort on the body parts assessed.